Learning CPG-based Biped Locomotion with a Policy Gradient Method: Application to a Humanoid Robot
Authors
Abstract
In this paper we describe a learning framework for a central pattern generator (CPG)-based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve CPG-based biped walking with a 3D hardware humanoid and to develop an efficient learning algorithm with CPG by reducing the dimensionality of the state space used for learning. We demonstrate that an appropriate feedback controller can be acquired within a few thousand trials by numerical simulations and the controller obtained in numerical simulation achieves stable walking with a physical robot in the real world. Numerical simulations and hardware experiments evaluate the walking velocity and stability. The results suggest that the learning algorithm is capable of adapting to environmental changes. Furthermore, we present an online learning scheme with an initial policy for a hardware robot to improve the controller within 200 iter-
The International Journal of Robotics Research, Vol. 27, No. 2, February 2008, pp. 213–228. DOI: 10.1177/0278364907084980. © SAGE Publications 2008, Los Angeles, London, New Delhi and Singapore.
Related resources
Learning CPG Sensory Feedback with Policy Gradient for Biped Locomotion for a Full-Body Humanoid
This paper describes a learning framework for a central pattern generator based biped locomotion controller using a policy gradient method. Our goals in this study are to achieve biped walking with a 3D hardware humanoid, and to develop an efficient learning algorithm with CPG by reducing the dimensionality of the state space used for learning. We demonstrate that an appropriate feedback contro...
Reinforcement Learning for a CPG-driven Biped Robot
Animal’s rhythmic movements such as locomotion are considered to be controlled by neural circuits called central pattern generators (CPGs). This article presents a reinforcement learning (RL) method for a CPG controller, which is inspired by the control mechanism of animals. Because the CPG controller is an instance of recurrent neural networks, a naive application of RL involves difficulties. ...
Reinforcement Learning for CPG-Driven Biped Robot
Animal’s rhythmic movements such as locomotion are considered to be controlled by neural circuits called central pattern generators (CPGs). This article presents a reinforcement learning (RL) method for a CPG controller, which is inspired by the control mechanism of animals. Because the CPG controller is an instance of recurrent neural networks, a naive application of RL involves difficulties. ...
Optimized Joint Trajectory Model with Customized Genetic Algorithm for Biped Robot Walk
Biped robot locomotion is one of the active research areas in robotics. In this area, real-time stable walking with proper speed is one of the main challenges that needs to be overcome. Central Pattern Generators (CPG) as one of the biological gait generation models, can produce complex nonlinear oscillation as a pattern for walking. In this paper, we propose a model for a biped robot joint tra...
Dynamic Control Algorithm for Biped Walking Based on Policy Gradient Fuzzy Reinforcement Learning
This paper presents a novel dynamic control approach to acquire biped walking of humanoid robots focussed on policy gradient reinforcement learning with fuzzy evaluative feedback. The proposed structure of controller involves two feedback loops: conventional computed torque controller including impact-force controller and reinforcement learning computed torque controller. Reinforcement learnin...
Journal title:
- I. J. Robotics Res.
Volume 27, Issue 2
Pages 213–228
Publication date: 2008